(57) Abstract

The invention pertains to a process for producing transgenic plants with increased nutritional value. It comprises: cultivat- 
ing plants obtained from regenerated plant cells or from seeds of plants obtained from said regenerated plant cells over one or 
several generations, whose genetic patrimony, replicable with said plants, comprises a precursor-coding nucleic acid sequence en- 
coding the precursor of a 2S albumin storage protein and placed under the control of a promoter capable of directing gene ex- 
pression in plants, said precursor-coding nucleic acid being modified in a nonessential region of its relevant sequence which en- 
codes the mature 2S albumin or a subunit thereof with a nucleic acid insert in appropriate reading frame relationship with the 
surrounding part of said relevant sequence, said insert including a determined segment encoding a heterologous determined
polypeptide containing appropriate aminoacids such as lysine and/or methionine and/or threonine and/or phenylalanine and/or
tryptophan and/or leucine and/or valine and/or isoleucine.
A process for the production of transgenic plants
with increased nutritional value via the 
expression of modified 2S storage albumins 

This invention relates to a process for the production 
of plants with increased content of appropriate aminoacids 
having high nutritional properties through the modification 
of plant genes encoding plant storage proteins, more particu- 
larly the 2S albumins. 

More particularly, the invention aims at providing
genetically modified plant DNA, and live plant material
including said genetically modified DNA replicable with the
cells of said plant material, which genetically modified
plant DNA contains sequences encoding a polypeptide
containing said appropriate aminoacids, whose expression is
under the control of a suitable plant promoter.

A further object of the invention is to take advantage 
of the capacity of 2S albumins to be produced in large 
amounts in plants.

A further object of the invention is to take advantage
of a hypervariable region of the 2S albumins: supplementing
said hypervariable region of the gene encoding said 2S
albumins with a number of codons for said appropriate
aminoacids does not disturb the correct expression,
processing and transport of the modified storage proteins
into the protein bodies of the plants.

Animals and humans obtain their essential aminoacids,
directly or indirectly, by eating plants. These essential
aminoacids include lysine, tryptophan, threonine,
methionine, phenylalanine, leucine, valine and isoleucine.
For ease of language these aminoacids are called
"appropriate aminoacids". Rather recently, agricultural
scientists concerned with the world hunger problem
concentrated their work on developing plants with high
nutritional yield. These new varieties, obtained through
breeding in most cases, were richer in carbohydrates but
usually poorer in essential proteins than the wild type
varieties from which they were derived. Currently,
increasing recognition of the role of plants in supplying
essential aminoacids to the animal world has led to emphasis
on the development of new food plants having a better
aminoacid content.
Classical breeding, however, has limitations for achieving
this goal. Molecular genetics, by contrast, offers a
possibility to overcome these difficulties. Reference is
made to the European patent application 80208418 and the
communication of Brown et al., 1986, in which a gene
encoding a corn seed storage protein (one of the so-called
zeins) is modified by the addition of lysine codons.

Seed storage proteins represent up to 90% of total seed 
protein in seeds of many plants. They are used as a source 
of nutrition for young seedlings in the period immediately 
after germination. The genes encoding them are strictly 
regulated, being expressed in a highly tissue specific and 
stage specific fashion (Walling et al., 1986; Higgins,
1984). Thus they are expressed almost exclusively in
developing seed, and different classes of seed storage
proteins may be expressed at different stages in the
development of the seed. They are generally restricted in
their intracellular location, being stored in membrane
bound organelles called protein bodies or protein storage
vacuoles. These organelles provide a protease-free
environment, and often also contain protease inhibitors. A
related group of proteins, the vegetative storage proteins,
have similar aminoacid compositions and are also stored in
specialized vacuoles, but are found in leaves instead of in
seeds (Staswick, 1988). These proteins are degraded upon
flowering, and are thought to serve as a nutritive source
for developing seeds.

The expression of foreign genes in plants is well
established (De Blaere et al., 1987). In several cases seed
storage protein genes have been transferred to other plants.
In most of these cases it was shown that within its new
environment the transferred seed storage protein gene is
expressed in a tissue specific and developmentally regulated
manner (Beachy et al., 1985; Sengupta-Gopalan et al., 1985;
Karris et al., 1988; Ellis et al., 1988; Higgins et al.,
1986; Okamuro et al., 1986). It has also been shown in at
least two cases that foreign seed storage proteins are
located in the protein bodies of the host plant (Greenwood
and Chrispeels, 1985; Hoffman et al., 1987). It has further
been shown that stable and functional messenger RNAs can be
obtained if a cDNA, rather than a complete gene including
introns, is used as the basis for the chimeric gene (Chee et
al., 1986).

Storage proteins are generally classified on the basis 
of solubility and size (more specifically sedimentation 
rate, for instance as defined by Svedberg (in Stryer, L.,
Biochemistry, 2nd ed., W.H. Freeman, New York, page 599)). A
particular class of seed storage proteins has been studied, 
the 2S seed storage proteins, which are water soluble albu- 
mins. They represent a significant proportion of the seed 
storage proteins in many plants (Youle and Huang, 1981) 
(Table I) and their small size and consequently simpler 
structure makes them an attractive target for modification 
(see also patent application EP 87 402 348.4). Several 2S 
storage proteins have been characterized at either the pro- 
tein, cDNA or genomic clone levels (Crouch et al., 1983; 
Sharief and Li, 1982; Ampe et al., 1986; Altenbach et al., 
1987; Ericson et al., 1986; De Castro et al., 1987; Scofield 
and Crouch, 1987; Josefsson et al., 1987; EP 87 402 348.4,
Krebbers et al., 1988). 2S albumins are formed in the cell 
from two subunits of 6-9 and 3-4 kilodaltons (kd) respective- 
ly, which are linked by disulfide bridges. 

The work in the references above showed that 2S albumins
are synthesized as complex prepropeptides whose organization
is shared between the 2S albumins of many different species;
this organization is shown diagrammatically for three of
these species in figure 1. Several complete sequences are
shown in figure 2.

As to Fig. 2, relative to protein sequences of 2S
albumins, the following observations are made. For B. napus,
B. excelsa and A. thaliana both the protein and DNA
sequences have been determined; for R. communis only the
protein sequence is available (B. napus from Crouch et al.,
1983 and Ericson et al., 1986; B. excelsa from Ampe et al.,
1986, De Castro et al., 1987 and Altenbach et al., 1987;
R. communis from Sharief and Li, 1982). Boxes indicate
homologies, and raised dots the position of the cysteines.

Comparison of the protein sequences at the beginning of
the precursor with standard consensus sequences for signal
peptides reveals that the precursor has not one but two
segments at the amino terminus which are not present in the
mature protein, the first of which is a signal sequence
(Perlman and Halvorson, 1983) and the second of which has
been designated as the amino terminal processed fragment (the
so-called ATPF). Signal sequences serve to ensure the
co-translational transport of the nascent polypeptide across
the membrane of the endoplasmic reticulum (Blobel, 1980), and
are found in many types of proteins, including all seed
storage proteins examined to date (Herman et al., 1986). This
is crucial for the appropriate compartmentalization of the
protein. The protein is further folded in such a way that
correct disulfide bridges are formed. This process is
probably localized at the luminal side of the endoplasmic
reticulum membrane, where the enzyme disulfide isomerase is
localized (Roden et al., 1982; Bergman and Kuehl, 1979). After
translocation across the endoplasmic reticulum membrane it is
thought that most storage proteins are transported via said
endoplasmic reticulum to the Golgi bodies, and from the
latter in small membrane bound vesicles ("dense vesicles") to
the protein bodies (Chrispeels, 1983; Craig and Goodchild,
1984; Lord, 1985). That the signal peptide is removed
co-translationally implies that the signals directing the
further transport of seed storage proteins to the protein
bodies must reside in the remainder of the protein sequence
present. Zeins and perhaps some other prolamins deviate
from this pathway; indeed the protein bodies are formed by
budding directly off the endoplasmic reticulum (Larkins
and Hurkman, 1978). As already noted, 2S albumins contain
sequences at the amino end of the precursor other than the
signal sequence which are not present in the mature
polypeptide. This is not general to all storage proteins.
This amino terminal processed fragment is labeled ATPF in
figure 1.

In addition, as shown in figure 1, several aminoacids 
located between the small and large subunits in the precursor 
are removed (labeled IPF in the figure, which stands for
internal processed fragment). Furthermore, several residues
are removed from the carboxyl end of the precursor (labeled
CTPF in the figure, which stands for carboxyl terminal
processed fragment). The cellular location of these latter
processing steps is uncertain, but is most likely the protein
bodies (Chrispeels et al., 1983; Lord, 1985). As a result of
these processing steps the small subunit and the large sub- 
unit remain. These are linked by disulfide bridges, as dis- 
cussed below. 

When the protein sequences of 2S albumins of different 
plants are compared strong structural similarities are ob- 
served. This is more particularly illustrated by figure 2 
which provides the aminoacid sequences of the small subunit 
and large subunit respectively of representative 2S storage 
seed albumin proteins of different plants, i.e.,: 
R. comm.  : Ricinus communis
A. thali. : Arabidopsis thaliana
B. napus  : Brassica napus
B. excel. : Bertholletia excelsa (Brazil nut)

It must be noted that in Fig. 2:

- the aminoacid sequences of said subunits extend over
  several lines; the cysteine groups of the aminoacid
  sequences of the exemplified storage proteins and
  identical aminoacids in several of said proteins have been
  brought into vertical alignment; the hyphen signs which
  appear in some of these sequences represent absent
  aminoacids, in other words direct linkages between the
  closest aminoacids which surround them;
- the aminoacid sequences which are conserved in the
  different proteins are framed.

It will be observed that all the sequences contain 
eight cysteine residues (the first and second in the small 
subunit, the remainder in the large subunit) which could 
participate in disulfide bridges as diagrammatically shown 
in Fig. 3, which represents a hypothetical model (for the 
purpose of the present discussion) rather than a
representation of the true structure of the 2S albumin of
Arabidopsis thaliana.

Said hypothetical model has been inspired by the dis- 
ulfide bridge mediated loop-formation of animal albumins, 
such as serum albumins (Brown, 1976), alpha-fetoprotein
(Jagodzinski et al., 1987; Morinaga et al., 1983) and the
vitamin D binding protein, where analogous constant C-C
doublets and C-X-C triplets were observed (Yang et al.,
1985).

As can be seen in Fig. 2, the regions which are
intercalated between the first and second cysteines, between
the fifth and sixth cysteines, and between the seventh and
eighth cysteines of the mature protein show a substantial
degree of conservation or similarity. It would thus seem that
these regions are in some way essential for the proper
folding and/or stability of the protein when synthesized in
the plants. An exception to this conservation is the
distance between the sixth and seventh cysteine residues.
This suggests that these arrangements are structurally
important, but that some variation is permissible in the
large subunit between said sixth and seventh cysteines, where
little conservation of aminoacids is observed.
An analogous suggestion has been made by Slightom and Chee
(1987), where the vicilin type seed storage proteins from
peas were compared. These authors indeed suggest that
aminoacid replacement mutations designed to increase the
number of sulphur containing aminoacids should be placed in
regions which show little or no conservation of aminoacid
sequences. The authors however conclude that whether such
modifications can be tolerated will need to be tested in the
seeds of transgenic plants. Moreover, the teaching provided
in their paper on the properties of the storage protein
modified through deletion concerns only the influence on
expression levels and not on processing of said storage
proteins.

An embodiment of this invention is the demonstration 
that a well chosen region of the 2S albumin allows variation 
without altering the properties and correct processing of 
said modified storage protein in plant cells of transgenic 
plants.

This region (diagrammatically shown in Fig. 3 by an
enlarged hatched portion) will be termed the "hypervariable
region" in the examples hereafter. Fig. 3 also shows the
respective positions of the other parts of the precursor
sequence, including the "IPF" section separating the small
subunit and large subunit of the precursor, as well as the
number of aminoacids (aa) in substantially conserved
portions of the protein subunits cysteine residues. The
processing cleavage sites (as determined by Krebbers et al.,
1988) are shown by symbols.

The seeds of many plants contain albumins of approximately
the same size as the storage proteins discussed above.
However, for ease of language, this document will use the
term "2S albumins" to refer to seed proteins whose genes
encode a peptide precursor with the general organization
shown in figure 1 and which are processed to a final form
consisting of two subunits linked by disulfide bridges. The
process of the invention for producing plants with an
increased content of appropriate aminoacids comprises:

- cultivating plants obtained from regenerated plant cells
  or from seeds of plants obtained from said regenerated
  plant cells over one or several generations, wherein the
  genetic patrimony or information of said plant cells,
  replicable within said plants, includes a nucleic acid
  sequence, placed under the control of a plant promoter,
  which can be transcribed into the mRNA encoding at least
  part of the precursor of a 2S albumin including the
  signal peptide of said plant, said nucleic acid being
  hereafter referred to as the "precursor encoding nucleic
  acid",

- wherein said nucleic acid contains a nucleotide
  sequence (hereafter termed the "relevant sequence") which
  relevant sequence comprises a nonessential region
  modified by a heterologous nucleic acid insert forming an
  open reading frame in reading phase with the non-modified
  parts surrounding said insert in said relevant sequence,

- wherein said insert includes a nucleotide segment
  encoding a polypeptide containing appropriate
  aminoacids.
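
The requirement that the insert form an open reading frame in
reading phase with the surrounding relevant sequence can be
illustrated by a minimal Python sketch. The function name
insert_in_frame, the toy sequences and the decision to reject
any in-frame stop codon in the insert are assumptions made for
illustration only; they are not part of the process as defined
above.

```python
# Illustrative sketch only: toy DNA, not taken from any real 2S albumin gene.
STOP_CODONS = {"TAA", "TAG", "TGA"}


def insert_in_frame(relevant_seq: str, insert: str, position: int) -> str:
    """Insert `insert` into `relevant_seq` at a codon boundary.

    `position` is a 0-based nucleotide offset counted from the first base
    of the reading frame (hypothetical convention chosen for this sketch).
    """
    relevant_seq, insert = relevant_seq.upper(), insert.upper()
    if not 0 <= position <= len(relevant_seq):
        raise ValueError("position lies outside the relevant sequence")
    if position % 3 != 0:
        raise ValueError("position must fall on a codon boundary")
    if len(insert) % 3 != 0:
        raise ValueError("insert length must be a multiple of 3")
    # Split the insert into codons and refuse premature termination.
    codons = [insert[i:i + 3] for i in range(0, len(insert), 3)]
    if any(codon in STOP_CODONS for codon in codons):
        raise ValueError("insert contains an in-frame stop codon")
    return relevant_seq[:position] + insert + relevant_seq[position:]


if __name__ == "__main__":
    # Two extra codons (Met, Lys) placed after the second codon of a toy frame.
    print(insert_in_frame("ATGGCTTGCTAA", "ATGAAG", 6))
```

Because the insert length is a multiple of three and the
insertion point lies on a codon boundary, the codons downstream
of the insert are read exactly as in the unmodified sequence.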

It will be appreciated that under the above mentioned
conditions each and every cell of the cultivated plant will
include the modified nucleic acid. Yet the above defined
recombinant or hybrid sequence will be expressed at high
levels either constitutively or only, or mostly, in certain
organs of the cultivated plants, depending on which plant
promoter has been chosen to direct its expression. In the
case of seed-specific promoters the hybrid storage protein
will be produced mostly in the seeds.

It will be understood that the "heterologous nucleic
acid insert" defined above consists of an insert which
contains nucleotide sequences which, at least in part, may be
foreign to the natural nucleic acid encoding the precursor of
the 2S albumins of the plant cells concerned and which encode
the appropriate aminoacids. Most generally the segment
encoding the polypeptide containing said appropriate
aminoacids will itself be foreign to the natural nucleic acid
encoding the precursor of said storage protein. Nonetheless,
the term "heterologous nucleic acid insert" also extends to
an insert containing a segment as above-defined that is
normally present in the genetic patrimony or information of
said plant cells, the "heterologous" character of said insert
then referring to the different genetic environment which
surrounds said insert.

In the preceding definition of the process according to
the invention the so-called "nonessential region" of the
relevant sequence of said nucleic acid encoding the
precursor consists of a region whose nucleotide sequence can
be modified either by insertion into it of the above defined
insert or by replacement of at least part of said
nonessential region by said insert, yet without disturbing
the stability and correct processing of said hybrid storage
protein or its transport into the above-said protein bodies.
Sequences consisting of said insert or replacement and
representing the coding region for a polypeptide containing
appropriate aminoacids can be put in either as synthetic
oligomers or as restriction fragments isolated from other
genes, as taught by Brown et al., 1986. The total length of
the hybrid storage protein may be longer or shorter than the
total length of the non-modified 2S albumin.

With respect to the choice of the region to be modified,
the present invention is clearly distinguishable from other
work which has been done in this field. Reference is made to
patent DD-A-240911 of the Akademie der Wissenschaften der
DDR, where legumin genes from Vicia faba (glutine and
prolamine) were modified in vitro with sequences encoding
methionine. A naturally occurring PstI site was chosen as
the place of insertion. At the EMBO workshop "Plant storage
protein genes" (Breisach, FRG, September 1986) the authors
presented their work and informed the audience that plant
transformation experiments had just been started with the
modified gene. No further results have yet been published.

Reference is also made to patent application
WO-A-87/07299 and the corresponding publication of Radke et
al., 1988. These papers describe the modification of the
napin gene, which encodes the 2S albumin of Brassica napus,
by a nucleotide sequence encoding nine aminoacid residues
including 5 consecutive methionines. The region of
modification is a naturally occurring SstI site within the
region encoding the mature protein. Such a modification would
result in an insertion directly adjacent to a cysteine
residue and moreover in a region between two cysteines,
namely the 4th and the 5th cysteines of the mature protein,
which correspond to the 2nd and 3rd cysteines of the large
subunit, whose length is strongly conserved (see above). We
believe such a modification is likely to disrupt the normal
folding and stability of the 2S albumin (see also EP 87 402
348.4). Moreover, the above cited references provide no
evidence that the desired modified 2S albumin was
successfully synthesized, correctly processed or correctly
targeted.

In the present invention the precursor-coding nucleic 
acid referred to above may of course originate from the same 
plant species as that which is cultivated for the purpose of 
the invention. It may however originate from another plant 
species, in line with the teachings of Beachy et al., 1985
and Okamuro et al., 1986 already of record. 

In a similar manner the plant promoter may originate
from the same plant species or from another, subject in the
last instance to the capability of the host plant's
polymerases to recognize it. It may act constitutively or in
a tissue-specific manner, such as, but not limited to,
seed-specific promoters.

Regions such as the ones at the end of the small
subunit, or at the beginning or end of the large subunit,
show differences of such a magnitude that they can be held
as presumably having no substantial impact on the final
properties of the protein. The extreme carboxyl terminus of
the small subunit and the amino terminus of the large
subunit may, however, be involved in the processing of the
internal processed fragment. A region which does not seem
essential consists of the middle portion of the region
located in the large subunit between the sixth and the
seventh cysteine of the mature protein, not immediately
adjacent to said cysteines but separated from them by at
least 3 aminoacids. Thus in addition to the absence of
similarity at the level of the aminoacid residues, there
appears a difference in length which makes that region
eligible for substitutions in the longest 2S albumins and
for addition of aminoacids in the shortest 2S albumins, or
for elongation of both. The same should be applicable at
approximately the end of the first third of the same region
between said sixth and seventh cysteine; see the sequence of
R. communis, which is much shorter in that region than the
corresponding regions of the other exemplified 2S proteins.
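
The reasoning above, that a segment whose length differs
between species is a candidate nonessential region, can be
sketched in Python as follows. This is a minimal illustration
under stated assumptions: the function names and the toy
sequences are hypothetical, the sequences are not real 2S
albumin sequences, and length variation between successive
cysteines is used as the only criterion, whereas the text also
considers aminoacid similarity.

```python
from __future__ import annotations


def cysteine_spacings(protein: str) -> list[int]:
    """Number of residues lying between each pair of successive cysteines."""
    positions = [i for i, aa in enumerate(protein.upper()) if aa == "C"]
    return [b - a - 1 for a, b in zip(positions, positions[1:])]


def most_variable_interval(proteins: dict[str, str]) -> tuple[int, int]:
    """Return (interval_index, length_range) for the inter-cysteine interval
    whose length varies most between the given sequences.

    Interval 0 lies between the first and second cysteine, and so on.
    """
    spacings = [cysteine_spacings(seq) for seq in proteins.values()]
    if len({len(s) for s in spacings}) != 1:
        raise ValueError("sequences must contain the same number of cysteines")
    # For each interval, the spread between the longest and shortest species.
    ranges = [max(column) - min(column) for column in zip(*spacings)]
    best = max(range(len(ranges)), key=ranges.__getitem__)
    return best, ranges[best]


if __name__ == "__main__":
    # Hypothetical toy sequences with three cysteines each.
    toy = {"species_a": "ACAACAAAAACA", "species_b": "ACAACAACA"}
    print(most_variable_interval(toy))
```

In the toy input the first interval has the same length in both
sequences while the second differs by three residues, so the
second interval is reported as the most variable one.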

It is of course realized that caution must be exercised
against hypotheses based on arbitrary choices as concerns the
bringing into line of similar parts of proteins which
elsewhere exhibit substantial differences. Nevertheless such
comparisons have proven in other domains of genetics to
provide the man skilled in the art with appropriate guidance
to reasonably infer from local structural differences, on the
one hand, and from local similarities, on the other hand, in
similar proteins of different sources, which parts of such
proteins can be modified and which parts cannot, when it is
sought to preserve some basic properties of the non-modified
protein in the same protein yet locally modified by a foreign
or heterologous sequence.

The choice of the adequate nonessential regions to be 
used in the process of the invention will also depend on the 
length of the polypeptide containing the appropriate ami- 
noacids. Basically the method of the invention allows the 
modification of said 2S albumins by the insertion and/or 
partial substitution into the precursor nucleic acid of se- 
quences encoding up to 100 aminoacids. 
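
As a rough illustration of the effect sought, the following
Python sketch computes the fraction of "appropriate aminoacids"
in a protein before and after insertion of a peptide, and
enforces the upper bound of 100 aminoacids mentioned above. The
function names and the toy sequences are assumptions made for
this example; they are not data from the invention.

```python
# One-letter codes for lysine, tryptophan, threonine, methionine,
# phenylalanine, leucine, valine and isoleucine.
APPROPRIATE = set("KWTMFLVI")
MAX_INSERT_RESIDUES = 100  # upper bound stated in the text


def appropriate_fraction(protein: str) -> float:
    """Fraction of residues that belong to the 'appropriate aminoacids'."""
    if not protein:
        raise ValueError("protein sequence is empty")
    return sum(aa in APPROPRIATE for aa in protein.upper()) / len(protein)


def insert_peptide(protein: str, peptide: str, position: int) -> str:
    """Insert `peptide` into `protein` before residue index `position`."""
    if len(peptide) > MAX_INSERT_RESIDUES:
        raise ValueError("insert exceeds 100 aminoacids")
    if not 0 <= position <= len(protein):
        raise ValueError("position lies outside the protein")
    return protein[:position] + peptide + protein[position:]


if __name__ == "__main__":
    host = "AAAAGGGGSS"  # hypothetical host segment with no appropriate residues
    hybrid = insert_peptide(host, "MKMKM", 5)
    print(appropriate_fraction(host), appropriate_fraction(hybrid))
```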

When the complete protein sequence of the region to be 
inserted into a 2S albumin has been determined, the nucleo- 
tide sequence to encode said protein sequence must be deter- 
mined. It will be recognized that while perhaps not absolute- 
ly necessary the codon usage of the encoding nucleic acid 
should where possible be similar to that of the gene being 
modified. 

The person skilled in the art will have access to appropriate 
computer analysis tools to determine said codon usage. 
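
One simple way to match the codon usage of the gene being
modified is sketched below in Python. It is an illustration
under assumptions, not the method of the invention: the codon
table is deliberately partial (only the standard-code codons
for methionine, lysine and threonine), the coding sequence is a
toy example, and "most frequent codon per aminoacid" is only
one possible definition of similar codon usage.

```python
from collections import Counter, defaultdict

# Partial standard genetic code: only the codons needed for this toy example.
CODON_TO_AA = {
    "ATG": "M",
    "AAA": "K", "AAG": "K",
    "ACT": "T", "ACC": "T", "ACA": "T", "ACG": "T",
}


def preferred_codons(coding_seq: str) -> dict:
    """Most frequent codon per aminoacid in an in-frame coding sequence."""
    counts = defaultdict(Counter)
    seq = coding_seq.upper()
    for i in range(0, len(seq) - 2, 3):
        codon = seq[i:i + 3]
        aa = CODON_TO_AA.get(codon)
        if aa is not None:  # codons outside the partial table are ignored
            counts[aa][codon] += 1
    return {aa: c.most_common(1)[0][0] for aa, c in counts.items()}


def back_translate(peptide: str, preferred: dict) -> str:
    """Encode `peptide` using the host gene's preferred codons."""
    return "".join(preferred[aa] for aa in peptide.upper())


if __name__ == "__main__":
    toy_gene = "ATGAAGAAGAAAACCACCACT"  # hypothetical coding sequence
    table = preferred_codons(toy_gene)
    print(back_translate("MKTM", table))
```

A peptide containing an aminoacid that never occurs in the host
gene raises a KeyError in this sketch; a fuller implementation
would need a fallback for that case.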
Any appropriate genetic engineering technique may be used for
substituting the insert for part of the selected
precursor-coding nucleic acid or for inserting it in the
appropriate region of said precursor-coding nucleic acid. The
general in vitro recombination techniques followed by cloning
in bacteria can be used for making the chimeric genes.
Site-directed mutagenesis can be used for the same purposes,
as further exemplified hereafter. DNA recombinants, e.g.
plasmids suitable for the transformation of plant cells, can
also be produced according to techniques disclosed in current
technical literature. The same applies finally to the
production of transformed plant cells in which the hybrid
storage protein encoded by the relevant parts of the selected
precursor-coding nucleic acid can be expressed. By way of
example, reference can be made to the published European
application no. 116 718 or to international application WO
84/02913, which disclose appropriate techniques to that
effect.

When designing the sequences rich in appropriate ami- 
noacids, care must be taken that the resulting peptide con- 
taining said appropriate aminoacids does not influence the 
stability of the modified 2S albumin. Certain insertions may 
indeed disrupt the structure of the protein. For example,
long stretches of methionines may result in rod shaped heli- 
ces which would result in instabilities due to disruption of 
normal folding patterns. Thus such sequences must occasional- 
ly include aminoacids which interrupt the helical structure. 
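
The precaution described above can be sketched in Python as a
check for long uninterrupted runs of one residue, with an
interrupting residue placed at intervals. The run length of
four and the use of proline ("P") as the interrupting aminoacid
are assumptions made for this illustration; the text does not
specify either, and the function names are hypothetical.

```python
def longest_run(peptide: str, residue: str = "M") -> int:
    """Length of the longest uninterrupted stretch of `residue`."""
    longest = current = 0
    for aa in peptide.upper():
        current = current + 1 if aa == residue else 0
        longest = max(longest, current)
    return longest


def interrupt_runs(peptide: str, residue: str = "M",
                   max_run: int = 4, breaker: str = "P") -> str:
    """Insert `breaker` so that no stretch of `residue` exceeds `max_run`."""
    if max_run < 1:
        raise ValueError("max_run must be at least 1")
    out = []
    run = 0
    for aa in peptide.upper():
        if aa == residue:
            if run == max_run:
                out.append(breaker)  # break the stretch before it grows further
                run = 0
            run += 1
        else:
            run = 0
        out.append(aa)
    return "".join(out)


if __name__ == "__main__":
    design = interrupt_runs("MMMMMMMMMM")
    print(design, longest_run(design))
```

Note that each interrupting residue lengthens the insert by one
aminoacid and therefore by one codon, which leaves the reading
frame unchanged.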

The procedures which have been disclosed hereabove apply
to the adequate modification of the nonessential region of
any 2S albumin by a heterologous insert containing a DNA
sequence encoding a peptide containing appropriate aminoacids
with nutritional properties, and then to the transformation
of the relevant plants with the chimeric gene obtained, for
the production of a hybrid protein containing the sequence of
said peptide in the cells of the relevant plant. Needless to
say, the person skilled in the art will in all instances be
capable of selecting which of the existing techniques would
best fulfill his needs at each step of the production of such
modified plants, to achieve the best production yields of
said hybrid storage protein.

For instance the following process can be used in order
to exploit the capacity of a 2S albumin to be used as a
suitable vector for the production of plants with increased
nutritional value, by inserting in said 2S albumins
nucleotide codons encoding methionine and/or lysine and/or
tryptophan and/or threonine and/or phenylalanine and/or
leucine and/or valine and/or isoleucine, when the
corresponding precursor-coding nucleic acid has been
sequenced. Such process then comprises:

1) locating and selecting one of said relevant sequences 
of the precursor-coding nucleic acid which comprises a 
nonessential region encoding a peptide sequence which can 
be modified by substituting an insert for part of it or 
by inserting said insert into it, which modification
is compatible with the conservation of the configuration
of said 2S albumins, and this preferably by determining
the relative positions of the codons which encode the 
successive cysteine residues in the mature protein or 
protein subunits of said 2S albumins and identifying the 
corresponding successive nucleic acid regions located 
upstream of, between, and downstream of said codons with- 
in said sub-sequences of the precursor-coding nucleic 
acid and identifying in said successive regions those 
parts which undergo variability in either aminoacid se- 
quence or length or both from one plant species to anoth- 
er as compared with those other regions which do exhibit 
substantial conservation of aminoacid sequence in said 
several plant species, one of said nucleotide regions 
being then selected for the insertion therein of the 
nucleic acid insert as described hereunder. 

An alternative would consist of studying any 3-D struc- 
tures which may become available in the future. 

2) inserting a nucleic acid insert in the selected region 
of said precursor nucleic acid in appropriate reading 
frame relationship with the non-modified parts of said 
relevant sequence, which insert includes a determined 
segment encoding a peptide containing all or part of the 
above mentioned appropriate aminoacids. 

3) inserting the modified precursor-coding nucleic acid
obtained in a plasmid suitable for the transformation of
plant cells which can be regenerated into full
seed-forming plants, wherein said insertion is brought
under the control of regulation elements, particularly a
plant promoter capable of providing for the expression
of the open reading frames associated therewith in said
plants;

4) transforming a culture of such plant cells with such 
modified plasmid; 

5) assaying the expression of the chimeric gene encoding 
the hybrid storage protein and, when achieved; 

6) regenerating said plants from the transformed plant 
cells obtained and growing said plants up to maturity. 

In case the chimeric gene is under the control of a
seed-specific promoter, growing the transformed plants to
seed must precede step 5).

Hence, with the embodiment of the invention described
under 1) hereabove, the hybrid 2S albumin produced in a plant
will pass the plant protein disulfide isomerase during
membrane translocation, thus increasing the chances that the
correct disulfide bridges are formed in the hybrid precursor
as in its normal precursor situation.

The invention further relates to the recombinant nucleic 
acids themselves for use in the process of the invention; 
particularly to the 

- recombinant precursor encoding nucleic acid defined 
in the context of said process; 

- recombinant nucleic acids containing said modified
precursor encoding nucleic acid under the control of
a plant promoter, whether the latter originates from
the same DNA as that of said precursor coding nucleic
acid or from another DNA of the same plant from which
the precursor encoding nucleic acid is derived, or
from a DNA of another plant, or from a non-plant
organism provided that it is capable of directing
gene expression in plants.
- vectors, more particularly plant plasmids e.g.,
Ti-derived plasmids modified by any of the preceding 
recombinant nucleic acids for use in the transforma- 
tion of the above plant cells. 

The invention also relates to the regenerable source of
the hybrid 2S albumin, which is formed of the cells of a
seed-forming plant, which plant cells are capable of being
regenerated into the full plant, or of seeds of said
seed-forming plants, wherein said plants or seeds have been
obtained as a result of one or several generations of the
plants resulting from the regeneration of said plant cells,
wherein further the DNA supporting the genetic information of
said plant cells or seeds comprises a nucleic acid or part
thereof, including the sequences encoding the signal peptide,
which can be transcribed into the mRNA corresponding to the
precursor of a 2S albumin of said plant, placed under the
control of a plant specific promoter, and

. wherein said nucleic acid sequence contains a relevant 
modified sequence encoding the mature 2S storage protein 
or one of the several sub-sequences encoding for the 
corresponding one or several sub-units of said mature 2S 
albumins, 

. wherein further the modification of said relevant 
sequence takes place in one of its nonessential regions 
and consists of a heterologous nucleic acid insert form- 
ing an open-reading frame in reading phase with non 
modified parts which surround said insert in the rele- 
vant sequence, 

. wherein said insert consists of a nucleotide segment
encoding a peptide containing methionine and/or lysine
and/or tryptophan and/or threonine and/or phenylalanine
and/or leucine and/or valine and/or isoleucine.

It is to be considered that, although the invention
should not be deemed as being limited thereto, the nucleic
acid inserts encoding the above mentioned appropriate
aminoacids will in most instances be man-made synthetic
oligonucleotides, or oligonucleotides derived from procaryotic
or eucaryotic genes or from cDNAs derived from procaryotic or
eucaryotic RNAs, all of which would normally escape any
possibility of being inserted at the appropriate places of the
plant cells or seeds of this invention through biological
processes, whatever the nature thereof. In other words, these
inserts are "non plant variety specific", especially in that
they can be inserted in different kinds of plants which are
genetically totally unrelated and thus incapable of exchanging
any genetic material by standard biological processes,
including natural hybridization processes.

Thus the invention further relates to the seed forming 
plants themselves which have been obtained from said trans- 
formed plant cells or seeds, which plants are characterized 
in that they carry said hybrid precursor-coding nucleic acids 
associated with a plant promoter in their cells, said inserts 
however being expressed and the corresponding hybrid protein 
produced in the cells of said plants. 

There follows an outline of a preferred method which can 
be used for the modification of a 2S albumin gene and its 
expression in the seeds obtained from the transgenic plants. 
The outline of the method given here is followed by a
specific example. It will be understood by the person skilled
in the art that the method can be suitably adapted for the
modification of other 2S albumin genes.

1. Replacement or supplementation of the hypervariable
region of the 2S albumin gene by a sequence encoding a
peptide containing appropriate aminoacids which possess
nutritional properties.

Either the cDNA or the genomic clone of the 2S albumin
can be used. Comparison of the sequences of the hypervariable
regions of the genes in figure 2 shows that they vary in
length. Therefore, if the sequence encoding a peptide
containing the appropriate aminoacids is short and a 2S
albumin with a relatively short hypervariable region is used,
said sequence of interest can simply be inserted. Otherwise
part of the hypervariable region is removed, to be replaced by
the insert containing a larger segment or sequence encoding
the peptide containing the appropriate aminoacids. In either
case the modified hybrid 2S albumin may be longer than the
native one, and two standard techniques can be applied:
convenient restriction sites can be exploited, or mutagenesis
vectors (e.g. Stanssens et al. 1987) can be used. In both
cases, care must be taken to maintain the reading frame of
the message.
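
The reading frame requirement can be checked mechanically before an
insert is synthesized. The following is a minimal sketch in Python and
is not part of the original disclosure: the function names are
hypothetical, and the test sequence is the methionine rich coding
segment used later in Example I. It only tests that the net length
change is a multiple of three and that the insert carries no in-frame
stop codon.

```python
# Illustrative helpers (hypothetical names, not from the disclosure).
# They check that swapping a deleted stretch of coding sequence for an
# insert leaves the downstream codons in phase.

STOP_CODONS = {"TAA", "TAG", "TGA"}


def _clean(seq: str) -> str:
    """Remove spaces and normalise to upper case."""
    return seq.replace(" ", "").upper()


def frame_preserved(deleted_len: int, insert: str) -> bool:
    """True if the net length change is a multiple of three nucleotides."""
    return (len(_clean(insert)) - deleted_len) % 3 == 0


def inframe_stops(insert: str, phase: int = 0) -> list:
    """Return the stop codons met when reading the insert from `phase`."""
    seq = _clean(insert)
    codons = [seq[i:i + 3] for i in range(phase, len(seq) - 2, 3)]
    return [codon for codon in codons if codon in STOP_CODONS]


if __name__ == "__main__":
    example_insert = "ATA ATG ATG ATG ATG CGC ATG"
    # Pure supplementation: nothing is deleted, 21 nucleotides are added.
    print(frame_preserved(0, example_insert))
    print(inframe_stops(example_insert))
```

When part of the hypervariable region is removed, `deleted_len` is the
number of nucleotides taken out, so the same test covers both the
supplementation and the replacement case.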

The sequence encoding the signal peptide of the precursor
of the storage protein used either belongs to this precursor
or can be a substitute sequence coding for the signal
peptide or peptides of a heterologous storage protein.

2. The altered 2S albumin coding region is placed under the
control of a plant promoter. Preferred promoters include
the strong constitutive exogenous plant promoters
such as the promoter from cauliflower mosaic virus
directing the 35S transcript (Odell, J.T. et al., 1985),
also called the 35S promoter; the 35S promoter from the
CaMV isolate Cabb-JI (Hull and Howell, 1987), also
called the 35S3 promoter; the bidirectional TR promoter
which drives the expression of both the 1' and the 2'
genes of the T-DNA (Velten et al., 1984).
Alternatively a promoter can be utilized which is not
constitutive but specific for one or more tissues or
organs of the plant. By way of example, such promoters
may be the light inducible promoter of the
ribulose-1,5-bisphosphate carboxylase small subunit
gene (US patent application 821,582), if expression
is desired in tissue with photosynthetic activity,
or may be seed specific promoters.

A seed specific promoter is used in order to ensure
subsequent expression in the seeds only. This may be of
particular use, since seeds constitute an important food or
feed source. Moreover, this specific expression avoids
possible stresses on other parts of the plant. In principle the
promoter of the modified 2S albumin can be used, but this is
not necessary: any other promoter serving the same purpose
can be used. The promoter may be chosen according to its
level of efficiency in the plant species to be transformed.
In the examples below the promoter of the Arabidopsis 2S
albumin gene is used, which is the natural promoter of the 2S
albumin gene modified in said examples. Other seed specific
promoters may of course be used, such as the conglycinin
promoter from soybean. If a chimeric gene is so constructed, a
signal peptide encoding region must also be included, either
from the modified gene or from the gene whose promoter is
being used. The actual construction of the chimeric gene is
done using standard molecular biological techniques described
in Maniatis et al., 1982 (see example).

3. The chimeric gene construction is transferred into the 
appropriate host plant. 

When the chimeric or modified gene construction is
complete it is transferred in its entirety to a plant
transformation vector. A wide variety of these, based on
disarmed (non-oncogenic) Ti-plasmids derived from Agrobacterium
tumefaciens, are available, both of the binary and
cointegration forms (De Blaere et al., 1987). A vector
including a selectable marker for transformation, usually
antibiotic resistance, should be chosen. Similarly, the
methods of plant transformation are also numerous, and are
fitted to the individual plant. Most are based on either
protoplast transformation (Marton et al., 1979) or
transformation of a small piece of tissue from the adult plant
(Horsch et al., 1985). In the example below, the vector is a
binary disarmed Ti-plasmid vector, the marker is kanamycin
resistance, and the leaf disc method of transformation is
used.

Calli from the transformation procedure are selected on
the basis of the selectable marker and regenerated to adult
plants by appropriate hormone induction. This again varies
with the plant species being used. Regenerated plants are
then used to set up a stable line from which seeds can be
harvested.

Further characteristics of the invention will appear in
the course of the non-limiting disclosure of specific
examples, particularly on the basis of the drawings in which:

- Figs. 1, 2 and 3 refer to overall features of 
2S-albumins as already discussed above. The numbers 
refer to the number of aminoacids observed in the 
different fragments of the protein precursor. 

- Fig. 4 represents the sequence of the 1 kb fragment
containing the Arabidopsis thaliana 2S albumin gene and
shows related elements. The NdeI site is underlined.

- Fig. 5 provides the protein sequence of the large 
subunit of the above Arabidopsis 2S protein together 
with related oligonucleotide sequences. 

- Fig. 6A shows diagrammatically the successive phases
of the construction of a chimeric Arabidopsis thaliana
2S albumin gene, including the deletion of practically
all of the hypervariable region and its replacement by
an AccI site, the insertion of DNA sequences rich in
methionine codons, given by way of example in the
following disclosure, into the AccI site, particularly
through site-directed mutagenesis, and the cloning of
said chimeric gene in a plant vector suitable for plant
transformation.

- Fig. 6B shows diagrammatically the protein sequence
of the large subunit of several Arabidopsis 2S albumins
and indicates the region removed from the genes
encoding said 2S albumins, and shows diagrammatically
where an AccI site has been created and how
oligonucleotides rich in methionine codons are inserted
into said AccI site in such a way that the open reading
frame is maintained.

- Fig. 7 diagrammatically compares the protein sequences
of the large subunits of the unmodified 2S albumin,
in which most of the hypervariable region has
been deleted, and those of the modified 2S albumins.
The resulting numbers of methionine residues
are indicated.

- Fig. 8 shows the restriction sites and genetic map 
of a plasmid suitable for the performance of the 
above site-directed mutagenesis. 

- Fig. 9 shows diagrammatically the different steps of 
the site-directed mutagenesis procedure of Stanssens 
et al (1987) as generally applicable to the modifica- 
tion of nucleic acid at appropriate places. 

- Fig. 10 gives the restriction map of pGSC1703A.

Example I : 

As a first example of the method described, a procedure
is given for the production of transgenic plant seeds with
increased nutritional value by having inserted into their
genome a modified 2S albumin gene from Arabidopsis
thaliana whose hypervariable region has been deleted and
replaced, by way of example, by a methionine rich peptide
of 7 aminoacids with the following sequence: I M M M M R M.
A synthetic oligomer encoding said peptide is substituted
for essentially the entire hypervariable region in a
genomic clone encoding the 2S albumin of Arabidopsis
thaliana. Only a few aminoacids adjacent to the sixth and
seventh cysteine residues remain. This chimeric gene is
under the control of its natural promoter and signal
peptide. The process and constructions are diagrammatically
illustrated in Fig. 6A, 6B and 7. The entire construct is
transferred to tobacco, Arabidopsis thaliana and Brassica
napus plants using an Agrobacterium-mediated transformation
system. Brassica napus is of particular interest, since
this crop is widely used as a protein source for animal
feed.

Plants are regenerated, and after flowering the seeds are 
collected and the methionine content compared with untrans- 
formed plants. 

1. Cloning of the Arabidopsis thaliana 2S albumin gene.



The Arabidopsis thaliana gene has been cloned according
to what is described in Krebbers et al., 1988. The
plasmid containing said gene is called pAT2S1. The
sequence of the region containing the gene, which is called
AT2S1, is shown in figure 4.



2. Deletion of the hypervariable region of the AT2S1 gene and
replacement by an AccI site.

Part of the hypervariable region of AT2S1 is replaced by
the following oligonucleotide (30-mer):

5' - CCA ACC TTG AAA GGT ATA CAC TTG CCC AAC - 3'
      P   T   L   K   G   I   H   L   P   N

in which the sequence GTATAC (underlined in the original)
represents the AccI site and the surrounding ones sequences
complementary to the coding sequence of the hypervariable
region of the Arabidopsis 2S albumin gene to be retained. This
results finally in the aminoacid sequence indicated under the
oligonucleotide.
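
The relationship between the 30-mer, the AccI site and the aminoacid
sequence can be reproduced with a few lines of Python. This is an
illustrative sketch only, not part of the original disclosure: the
helper names are hypothetical, the codon table is the standard genetic
code, and the pattern GT[AC][GT]AC is the AccI recognition sequence
written as a regular expression.

```python
import re

# Standard genetic code, built from the conventional T, C, A, G ordering.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}

# AccI recognises GT(A/C)(G/T)AC.
ACCI = re.compile("GT[AC][GT]AC")

# The mutagenic 30-mer, read as printed, 5' to 3'.
OLIGO_30MER = "CCA ACC TTG AAA GGT ATA CAC TTG CCC AAC".replace(" ", "")


def translate(dna: str) -> str:
    """Translate a coding strand from its first nucleotide, codon by codon."""
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, len(dna) - 2, 3))


def acci_sites(dna: str) -> list:
    """Return the 0-based start positions of AccI sites in the sequence."""
    return [match.start() for match in ACCI.finditer(dna)]


if __name__ == "__main__":
    print(len(OLIGO_30MER))          # length of the oligonucleotide
    print(translate(OLIGO_30MER))    # encoded aminoacids
    print(acci_sites(OLIGO_30MER))   # position of the AccI site
```

The site spans the last two bases of the GGT codon, the ATA codon and
the first base of CAC, which is why it can be introduced without
disturbing the reading frame.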




The deletion and substitution of part of the sequence encod- 
ing the hypervariable region of AT2S1 is done using site 
directed mutagenesis with the oligonucleotide as primer. The 
system of Stanssens et al. (1987) is used. 

The Stanssens et al. method is described in EP 87 402 384.4.
It makes use of plasmids pMa5-8 and pMc5-8 (pMa/c5-8), whose
restriction and genetic map and the positions of the relevant
genetic loci are shown in Fig. 8. The arrows denote their
functional orientation. fdT: central transcription terminator
of phage fd; F1-ORI: origin of replication of filamentous
phage f1; ORI: ColE1-type origin of replication; BLA/ApR:
region coding for β-lactamase; CAT/CmR: region coding for
chloramphenicol acetyl transferase. The positions of the amber
mutations present in pMc5-8 (the bla-am gene does not contain
the ScaI site) and pMa5-8 (cat-am; the mutation eliminates the
unique PvuII site) are indicated. Suppression of the cat amber
mutation in both supE and supF hosts results in resistance to
at least 25 µg/ml Cm. pMc5-8 confers resistance to ±20 µg/ml
and 100 µg/ml Ap upon amber-suppression in supE and supF
strains respectively. The EcoRI, BalI and NcoI sites present in
the wild-type cat gene (indicated with an asterisk) have been
removed using mutagenesis techniques.

Essentially the mutagenesis round used for the above
mentioned substitution is run as follows. Reference is made to
Fig. 9, in which the amber mutations in the Ap and Cm
selectable markers are shown by closed circles. The symbol *
represents the mutagenic oligonucleotide. The mutation itself
is indicated by an arrowhead.

The individual steps of the process are as follows: 

- Cloning of the HindIII fragment of pAT2S1 containing the
coding region of the AT2S1 gene into pMa5-8 (I). This
vector carries an amber mutation in the CmR gene and
specifies resistance to ampicillin. The resulting plasmid
is designated pMacAT2S1 (see figure 6A, step 1).

- Preparation of single stranded DNA of this recombinant
(II) from pseudoviral particles.

- Preparation of a HindIII restriction fragment from the
complementary pMc-type plasmid (III). pMc-type vectors
contain the wild type CmR gene while an amber mutation
is incorporated in the Ap resistance marker.

- Construction of gap duplex DNA (hereinafter called
gdDNA) (IV) by in vitro DNA/DNA hybridization. In
the gdDNA the target sequences are exposed as single
stranded DNA. Preparative purification of the gdDNA from
the other components of the hybridization mixture is not
necessary.

- Annealing of the 30-mer synthetic oligonucleotide to the
gdDNA (V).

- Filling in of the remaining single stranded gaps and
sealing of the nicks by a simultaneous in vitro Klenow DNA
polymerase I / DNA ligase reaction (VI).

- Transformation of a mutS host, i.e., a strain
deficient in mismatch repair, selecting for Cm
resistance. This results in production of a mixed plasmid
progeny (VII).

- Elimination of progeny deriving from the template strand
(pMa-type) by retransformation of a host unable to
suppress amber mutations (VIII). Selection for Cm
resistance results in enrichment of the progeny derived from
the gapped strand, i.e., the strand into which
the mutagenic oligonucleotide has been incorporated.

- Screening of the clones resulting from the
retransformation for the presence of the desired mutation.
The resulting plasmid containing the deleted hypervariable
region of AT2S1 is called pMacAT2S1C40 (see figure 6A,
step 2).
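
The marker logic behind steps (VII) and (VIII) can be summarised in a
small model. The Python sketch below is an illustration only and makes
a simplifying assumption that is not stated in the disclosure: an
amber mutation abolishes resistance completely in a non-suppressing
host and is fully rescued in a suppressing host. Class and function
names are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Plasmid:
    """A plasmid type and the resistance genes that carry an amber mutation."""
    name: str
    amber_in: frozenset


# pMa-type: ampicillin resistant, amber mutation in the Cm resistance gene.
PMA = Plasmid("pMa-type", frozenset({"Cm"}))
# pMc-type: wild type Cm resistance gene, amber mutation in the Ap marker.
PMC = Plasmid("pMc-type", frozenset({"Ap"}))


def grows(plasmid: Plasmid, antibiotic: str, host_suppresses_amber: bool) -> bool:
    """Assumed rule: an amber marker only works in a suppressing host."""
    return host_suppresses_amber or antibiotic not in plasmid.amber_in


def select(progeny, antibiotic: str, host_suppresses_amber: bool) -> list:
    """Return the plasmids whose host cells survive the selection."""
    return [p for p in progeny if grows(p, antibiotic, host_suppresses_amber)]


if __name__ == "__main__":
    # Mixed progeny after the mutS step: template strand (pMa-type) and
    # gapped strand carrying the mutagenic oligonucleotide (pMc-type).
    mixed = [PMA, PMC]
    survivors = select(mixed, "Cm", host_suppresses_amber=False)
    print([p.name for p in survivors])
```

Under this assumption only pMc-type progeny survive Cm selection in the
non-suppressing host, which is the enrichment described in the last
but one step.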

3. Insertion of sequences rich in methionine codons into
the AT2S1 gene whose sequences encoding the hypervariable
region have been deleted.

As stated above, when the sequences encoding most of the
hypervariable loop were removed an AccI site was inserted in
its place. The sequences of interest will be inserted into
this AccI site, but a second AccI site is also present in the
HindIII fragment containing the modified gene. Therefore the
NdeI-HindIII fragment containing the modified gene is
subcloned into the cloning vector pBR322 (Bolivar, 1977) also
cut with NdeI and HindIII. The position of the NdeI site in
the 2S albumin gene is indicated in figure 4. The resulting
subclone is designated pBRAT2S1 (Figure 6A, step 3).

In principle any desired insert can be inserted into the
AccI site in pBRAT2S1. In the present example said insert
encodes the following sequence: I.M.M.M.M.R.M. Therefore
complementary oligonucleotides encoding said peptide are
synthesized, taking into account the codon usage of AT2S1 and
ensuring that the ends of the two complementary
oligonucleotides are complementary to the staggered ends of the
AccI site, as shown here (the oligonucleotides are shown in
bold type in the original):

5' GT ATA ATG ATG ATG ATG CGC ATG ATAC 3'
3' CA TAT TAC TAC TAC TAC GCG TAC TATG 5'

The details of this insertion, showing how the reading
frame is maintained, are shown in figure 6B. The two
oligonucleotides are annealed and ligated with pBRAT2S1
digested with AccI (figure 6A, step 4). The resulting plasmid
is designated pAD4.
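
As a consistency check on the duplex shown above, the short Python
sketch below verifies that the two printed strands are complementary
and that the material added between the two halves of the AccI site is
a whole number of codons. It is an illustration, not part of the
original disclosure. It assumes that the leading GT and the trailing
ATAC on the top strand are the two halves of the cut GTATAC site, and
it reads the damaged bottom strand triplet as GCG; the function names
are hypothetical.

```python
# Watson-Crick pairing for the four DNA bases.
COMPLEMENT = str.maketrans("ACGT", "TGCA")

# Top strand as printed 5' to 3', bottom strand as printed 3' to 5'.
TOP_5_TO_3 = "GT ATA ATG ATG ATG ATG CGC ATG ATAC".replace(" ", "")
BOTTOM_3_TO_5 = "CA TAT TAC TAC TAC TAC GCG TAC TATG".replace(" ", "")


def complement(seq: str) -> str:
    """Base-by-base complement, keeping the printed left-to-right order."""
    return seq.translate(COMPLEMENT)


def net_insertion(top: str, left: str = "GT", right: str = "ATAC") -> str:
    """Sequence added between the two halves of the cut AccI site."""
    if not (top.startswith(left) and top.endswith(right)):
        raise ValueError("top strand is not flanked by the AccI half-sites")
    return top[len(left):len(top) - len(right)]


if __name__ == "__main__":
    added = net_insertion(TOP_5_TO_3)
    print(complement(TOP_5_TO_3) == BOTTOM_3_TO_5)  # strands pair exactly
    print(added, len(added), len(added) % 3 == 0)   # whole number of codons
```

Because the added stretch is 21 nucleotides long, the codons downstream
of the former AccI site stay in phase with those upstream of it.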

4. Reconstruction of the complete modified AT2S1 gene with
its natural promoter.

The complete chimeric gene is reconstructed as follows
(see figure 6A): the clone pAT2S1Bg contains a 3.6 kb BglII
fragment inserted in the cloning vector pJB65 (Botterman et
al., 1987), which encompasses not only the 1.0 kb HindIII
fragment containing the coding region of the gene AT2S1 but
also sufficient sequences upstream and downstream of this
fragment to contain all necessary regulatory elements for the
proper expression of the gene. This plasmid is cut with HindIII
and the 5.2 kb fragment (i.e., that portion of the plasmid not
containing the coding region of AT2S1) is isolated. The
clone pAT2S1 is cut with HindIII and NdeI and the resulting
320 bp HindIII-NdeI fragment is isolated. This fragment
represents the one removed from the modified 2S albumin in
the construction of pBRAT2S1 (step 3 of figure 6A) in order
to allow the insertion of the oligonucleotides in step 4 of
figure 6A to proceed without the complications of an extra
AccI site. These two isolated fragments are then ligated in
a three way ligation with the NdeI-HindIII fragment from pAD4
(figure 6A, step 5) containing the modified coding sequence.
Individual transformants can be screened to check for
appropriate orientation of the reconstructed HindIII fragment
within the BglII fragment using any of a number of sites. The
resulting plasmid, pAD17, consists of a 2S albumin gene
modified only in the hypervariable region, surrounded by the
same flanking sequences and thus the same promoter as the
unmodified gene, the entirety contained on a BglII fragment.

5. Transformation of plants.

The BglII fragment containing the chimeric gene is inserted
into the BglII site of the binary vector pGSC1703A
(Fig. 10) (see also Fig. 6A, step 6). The resultant plasmid
is designated pTAD12. Vector pGSC1703A contains functions
for selection and stability in both E. coli and A.
tumefaciens, as well as a T-DNA fragment for the transfer of
foreign DNA into plant genomes (Deblaere et al., 1987). It
further contains the bi-directional TR promoter (Velten et al.,
1984) with the neomycin phosphotransferase protein coding
region (neo) and the 3' end of the ocs gene on one side, and
a hygromycin transferase gene on the other side, so that
transformed plants are both kanamycin and hygromycin
resistant. This plasmid does not carry an ampicillin resistance
gene, so that carbenicillin as well as claforan can be used
to kill Agrobacterium after the infection step. Using
standard procedures (Deblaere et al., 1987), pTAD12 is
transferred to the Agrobacterium strain C58C1Rif carrying the
plasmid pMP90 (Koncz and Schell, 1986). The latter provides
in trans the vir gene functions required for successful
transfer of the T-DNA region to the plant genome. This
Agrobacterium is then used to transform plants. Tobacco plants
of the strain SR1 are transformed using standard procedures
(Deblaere et al., 1987). Calli are selected on 100 µg/ml
kanamycin, and resistant calli are used to regenerate plants.

The techniques for transformation of Arabidopsis thaliana
and Brassica napus are such that exactly the same
construction, in the same vector, can be used. After
mobilization to Agrobacterium tumefaciens as described
hereabove, the procedures of Lloyd et al. (1986) and
Klimaszewska et al. (1985) are used for transformation of
Arabidopsis and Brassica respectively. In each case, as for
tobacco, calli can be selected on 100 µg/ml kanamycin, and
resistant calli used to regenerate plants.

In the case of all three species, at an early stage of
regeneration the regenerants are checked for transformation
by inducing callus from leaf on media supplemented with
kanamycin (see also point 6).

6. Screening and analysis of transformed plants.

In the case of all three species, regenerated plants are
grown to seed. Since different transformed plants can be
expected to have varying levels of expression ("position
effects", Jones et al., 1985), more than one transformant must
initially be analyzed. This can in principle be done at
either the RNA or protein level; in this case seed RNA was
prepared as described (Beachy et al., 1985) and northern
blots carried out using standard techniques (Thomas et al.,
1980). Since in the case of both Brassica and Arabidopsis
the use of the entire chimeric gene would result in cross
hybridization with endogenous genes, oligonucleotide probes
complementary to the insertion within the 2S albumin were
used; the same oligonucleotide as used to make the
construction can be used. For each species, 1 or 2 individual
plants were chosen for further analysis as discussed below.

First the copy number of the chimeric gene is determined 
by preparing DNA from leaf tissue of the transformed plants 
(Dellaporta et al., 1983) and probing with the oligonucleo- 
tide used above. 

The methionine content of the seeds is analyzed using
known methods (Joseph and Marsden, 1986; Gehrke et al., 1985;
Elkin and Griffith, 1985 (a) and (b)).

Example II 

As a second example of the method described, the same
procedure is followed for the production of transgenic plant
seeds with increased nutritional value by having inserted
into their genome a modified 2S albumin gene from Arabidopsis
thaliana whose hypervariable region has been deleted and
replaced, by way of example, by a methionine rich peptide
of 24 aminoacids with the following sequence:

I M M M Q P R G D M M M I M M M Q P R G D M M M

All the steps going from constructs to transformants
disclosed for example I are carried out, with the only
difference that in step 3 the following oligonucleotides are
synthesized and inserted into pBRAT2S1 (the oligonucleotides
are shown in bold type in the original):

5' GT ATA ATG ATG ATG CAA CCA AGG GGC GAT ATG ATG ATG ATA
3' CA TAT TAC TAC TAC GTT GGT TCC CCG CTA TAC TAC TAC TAT

   ATG ATG ATG CAA CCA AGG GGC GAT ATG ATG ATG ATA C 3'
   TAC TAC TAC GTT GGT TCC CCG CTA TAC TAC TAC TAT G 5'

Details of the insertion are shown in figure 6B and the
resulting aminoacid sequence of the hybrid subunit in figure 7.
The relevant plasmids as indicated in figure 6A are pAD3, pAD7
and pTAD10.
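
The peptides encoded by the two example inserts, and their methionine
counts, can be derived directly from the coding strands. The Python
sketch below is illustrative only and not part of the original
disclosure. It assumes the standard genetic code, and it reads the
damaged characters of the printed Example II strand as the bases
implied by its complementary strand (AGG, GGC, GAT); the names are
hypothetical.

```python
# Standard genetic code, built from the conventional T, C, A, G ordering.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}

# Coding segments between the two halves of the AccI site (top strands).
INSERT_EXAMPLE_I = "ATA ATG ATG ATG ATG CGC ATG"
INSERT_EXAMPLE_II = (
    "ATA ATG ATG ATG CAA CCA AGG GGC GAT ATG ATG ATG "
    "ATA ATG ATG ATG CAA CCA AGG GGC GAT ATG ATG ATG"
)


def translate(dna: str) -> str:
    """Translate a coding strand, ignoring spaces, from its first base."""
    seq = dna.replace(" ", "").upper()
    return "".join(CODON_TABLE[seq[i:i + 3]] for i in range(0, len(seq) - 2, 3))


def methionine_count(peptide: str) -> int:
    """Number of methionine (M) residues in a one-letter peptide string."""
    return peptide.count("M")


if __name__ == "__main__":
    for label, dna in (("Example I", INSERT_EXAMPLE_I),
                       ("Example II", INSERT_EXAMPLE_II)):
        peptide = translate(dna)
        print(label, peptide, len(peptide), methionine_count(peptide))
```

Counting residues this way gives a quick estimate of how many
methionines each hybrid subunit gains; the measured seed methionine
content still has to be determined as described under point 6.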

The examples have thus given a complete illustration of
how 2S albumin storage proteins can be modified to
incorporate therein an insert encoding a methionine rich
polypeptide, followed by the transformation of plant cells such
as tobacco cells, Arabidopsis cells and Brassica napus cells
with an appropriate plasmid containing the corresponding
modified precursor nucleic acid, the regeneration of the
transformed plant cells into corresponding plants, the culture
thereof up to the seed forming stage, the recovery of the seeds
and finally the analysis of the methionine content of said
seeds compared with the seeds of corresponding non transformed
plants.

It goes without saying that the invention is not limited
to the above examples. The person skilled in the art will in
each case be able to choose the desired combination of
appropriate aminoacids to be inserted into the hypervariable
region of the 2S storage protein, depending on the plant he
wants to improve with regard to its nutritional value and on
the desired application of the modified plant.

There follows a list of bibliographic references which
have been referred to in the course of the present disclosure
to the extent that reference has been made to known methods
for achieving some of the process steps referred to herein or
to general knowledge which had been established prior to the
performance of this invention. All of the said articles are
incorporated herein by reference.

It is further confirmed

- that plasmid pAT2S1 has been deposited with the DSM under
number 4879 on October 7, 1988;

- that plasmids pMa5-8 and pMc5-8 have been deposited with the
DSM under numbers 4567 and 4566 respectively on May 3, 1988;

- that plasmid pAT2S1Bg has been deposited with the DSM under
number 4878 on October 7, 1988;

- that plasmid pGSC1703A has been deposited with the DSM under
number 4880 on October 7, 1988,

notwithstanding the fact that they all consist of constructs
that the person skilled in the art can reproduce from
available genetic material without performing any inventive
work.






References : 

Altenbach, S.B., Pearson, K.W., Leung, F.W., Sun, S.S.M.
(1987) Plant Mol. Biol. 8, 239-250.

Ampe, C., Van Damme, J., de Castro, L.A.B., Sampaio,
M.J.A.M., Van Montagu, M. and Vandekerckhove, J. (1986) Eur.
J. Biochem. 152, 597-604.

Beachy, R.N., Chen, Z.-L., Horsch, R.B., Rogers, S.G.,
Hoffman, N.J. and Fraley, R.T. (1985) EMBO J. 4, 3047-3053.

Bergman, L.W. and Kuehl, W.N. (1979) J. Biol. Chem. 254,
5690-5694.

Blobel (1980) Proc. Natl. Acad. Sci. 77, 1496-1500.

Bolivar, F., Rodriguez, R.L., Greene, P.J., Betlach, M.C.,
Heynecker, R.L., Boyer, H.W., Crosa, J.H. and Falkow, S.
(1977) Gene 2, 95.

Botterman, J. and Zabeau, M. (1987) DNA 6, 583-591.

Brown, J. (1976) Fed. Proc. Am. Soc. Exp. Biol. 15,
2141-2144.

Brown, J.W.S., Wandelt, Ch., Maier, U., Dietrich, G.,
Schwall, N., and Feix, G. (1986) EMBO workshop "Plant
Storage Protein Genes" Program and Abstracts page 19, Eds. J.
Brown and G. Feix, University of Freiburg, 1986.

Chee, P.P., Klassy, R.C. and Slightom, J.L. (1986) Gene 41,
47-57.

Chrispeels, M.J. (1983) Planta 151, 140-152.

Craig, S. and Goodchild, D.J. (1984) Protoplasma 122, 35-44.

Crouch, M.L., Tembarge, X.M., Simon, A.E. and Ferl, R.
(1983) J. Mol. Appl. Gen. 2, 273-283.

De Blaere, R., Reynaerts, A., Hofte, H., Hernalsteens,
J.-P., Leemans, J. and Van Montagu, M. (1987) Methods in
Enzymology 153, 277-291.

De Castro, L.A.B., Lacerada, Z., Aramayo, R.A., Sampaio,
M.J.A.M. and Gander, E.S. (1987) Mol. Gen. Genet. 206,
338-343.



Dellaporta, S.L., Wood, J. and Hicks, B. (1983) Plant
Molecular Biology Reporter 1, 19-21.

Ellis, J.R., Shirsat, A.H., Hepher, A., Yarwood, J.N.,
Gatehouse, J.A., Croy, R.R.D. and Boulter, D. (1988) Plant
Molecular Biology 10, 203-214.

Elkin, R.G., and Griffith, J.E. (1985a) J. Assoc. Off. Anal.
Chem. 68, 1028-1032.

Elkin, R.G., and Griffith, J.E. (1985b) J. Assoc. Off. Anal.
Chem. 68, 1117-1127.

Ericson, M.L., Rodin, J., Lenman, M., Glimelius, K.,
Josefsson, L.-G. and Rask, L. (1986) J. Biol. Chem. 261,
14576-14581.

Greenwood, J.S. and Chrispeels, M.J. (1985) Plant Physiol.
79, 65-71.

Gehrke, C.W., Hall, L.L., Absheer, J.S., Kaiser, F.E. and
Zumwalt, R.N. (1985) J. Assoc. Off. Anal. Chem. 68, 811-821.

Herman, E.M., Shannon, L.M. and Chrispeels, M.J. (1986) In
Molecular Biology of Seed Storage Proteins and Lectins. L.M.
Shannon and M.J. Chrispeels Eds., American Society of Plant
Physiologists.

Higgins, T.J.V. (1984) Ann. Rev. Plant Physiol. 35, 191-221.

Higgins, T.J.V., Llewellyn, D., Newbigin, E. and Spencer, D.
(1986) EMBO workshop "Plant Storage Protein Genes" Program
and Abstracts page 19, Eds. J. Brown and G. Feix, University
of Freiburg, 1986.

Horsch, R.B., Fry, J.E., Hoffmann, N.L., Eichholtz, D.,
Rogers, S.G. and Fraley, R.T. (1985) Science 227, 1229-1231.

Hoffman, L.M., Donaldson, D.D., Bookland, R., Rashka, K.,
Herman, E.M. (1987) EMBO J. 6, 3213-3221.

Hull and Howell (1987) Virology 86, 482-493.

Jagodinski, L., Sargent, T., Yang, M., Glackin, C., Bonner,
J. (1987) Proc. Natl. Acad. Sci. USA 78, 3521-3525.

Jones, J.D.G., Dunsmuir, P. and Bedbrook, J. (1985) EMBO J. 4
(10), 2411-2418.



Joseph, H. and Marsden, J. (1986) "Amino Acids and Small
Peptides" in: "HPLC of Small Molecules - A practical
approach", Ed.: Lim, C.K., IRL Press, Oxford - Washington
D.C., 13-27.

Josefsson, L.-G., Lenman, M., Ericson, M.L. and Rask, L.
(1987) J. Biol. Chem. 262 (25), 12196-12201.

Klimaszewska, K. and Keller, H.A. (1985) Plant Cell Tissue
Organ Culture 4, 183-197.

Krebbers, E., Herdies, L., De Clercq, A., Seurinck, J.,
Leemans, J., Van Damme, J., Segura, M., Gheysen, G., Van
Montagu, M. and Vandekerckhove, J. (1988) Plant Physiol.
87 (4), 859-866.

Koncz, C. and Schell, J. (1986) Mol. Gen. Genet. 204,
383-396.

Larkins, B.A. and Hurkman, H.J. (1978) Plant Physiol. 62,
256-263.

Lloyd, A.M., Barnason, A.R., Rogers, S.G., Byrne, M.C.,
Fraley, R.T. and Horsch, R.B. (1986) Science 234, 464-466.

Lord, J.M. (1985) Eur. J. Biochem. 146, 403-409.

Maniatis, T., Fritsch, E.F. and Sambrook, J. (1982)
Molecular Cloning. Cold Spring Harbor Laboratory, Cold Spring
Harbor, New York.

Marris, C., Gallois, P., Copley, J. and Kreis, N. (1988)
Plant Molecular Biology 12, 359-366.

Marton, L., Mullens, G.J., Molendijk, L. and Schilperoort,
R.A. (1979) Nature 277, 129-131.

Morinaga, T., Sakai, N., Wegmann, T., Tanaoki, T. (1983)
Proc. Natl. Acad. Sci. 80, 4604-4606.

Odell, J.T., Nagy, J. and Chua, N.M. (1985) Nature 313,
810-812.

Okamuro, J.K., Jofuku, K.D. and Goldberg, R.B. (1986) Proc.
Natl. Acad. Sci. USA 83, 8240-8244.

Perlman, D. and Halvorson, H.O. (1983) J. Mol. Biol. 167,
391-409.



35 



fci/tnjy/oizxy 



34 

Radke, S.E., Andrews, B.N. , Moloney, M.M., Crouch, M.L. 
Kridl, J.C. and Knauf, V.C. (1988) Theor. Appl. Genet. 2£, 
685-694. 

Roden, L.T., Miflin, B.J., Freedman, R.B. (1982) FEBS Lett. 
12ft, 121-124. 

Scofield, s.R. and Crouch, M.L. (1987) J. Biol. Chem. 262 
(25), 12202-12208. 

Sengupta-Gopalan, C. , Reichert, N.A., Barker, R.F., Hall, 
T.C. and Kemp, J.O. (1985) Proc. Natl. Acad. Sci. USA £2, 
3320-3324. 

10 

Sharief, S.F. and Steven, S.-L. (1982) J. Biol. Chem. 
252(24), 14753-14795. 

Slightom, J.L. and Chee, P.P. (1987) Biotechn. Adv. 5., 
29-45. 

Stanssens, P., NcKeovn, Y. , Friedrich, K. , and Fritz, H.J. 
(1987) Manual EMBO Laboratory Course; 'Directed mutagensis 
and protein engineering' held at Max Planck Institute f±r 
Biochemie, Martinsried, W-Germany, July 4-18, 1987. 
Staswick, P.E. (1988) Plant Physiol. £2, 250-254. 
2Q Thomas, P.S. (1980) Proc. Natl. Acad. Sci. 22, 5201. 

Velten, J., Velten, L., Hain, R. and Schell, J. (1984) EMBO
J. 3, 2723-2730.

Walling, L., Drews, G.N. and Goldberg, R. (1986) Proc. Natl.
Acad. Sci. 83, 2123-2127.

Yang, F., Luna, V.G., McAnelly, R.D., Naberhaus, K.H.,
Cupples, R.L., Bowman, B.H. (1985) Nucl. Acids Res. 13,
8007-8017.

Youle, R. and Huang, A.H.C. (1981) American J. Bot. 68,
44-48.

TABLE 1

2S Albumin As % Of Total Seed Protein

Family          Species (common name)                 %

Compositae      Helianthus annuus (sunflower)         62
Cruciferae      Brassica spp. (mustard)               62
Linaceae        Linum usitatissimum (linseed)         42
Leguminosae     Lupinus polyphyllus (lupin)           38
                Arachis hypogaea (peanut)             20
Lecythidaceae   Bertholletia excelsa (brazil nut)     30
Liliaceae       Yucca spp. (yucca)                    27
Euphorbiaceae   Ricinus communis (castor bean)        44

From Youle and Huang, 1981




CLAIMS 

1. A process for producing transgenic plants with increased
nutritional value which comprises:

- cultivating plants obtained from regenerated plant
cells or from seeds of plants obtained from said
regenerated plant cells over one or several generations,
wherein the genetic patrimony or information
of said plant cells, replicable within said plants,
includes a nucleic acid sequence encoding a modified
2S albumin storage protein derived from a natural
2S albumin storage protein and placed under the
control of a promoter capable of directing gene
expression in said plants;

• wherein said nucleic acid encodes at least part of
the precursor of said 2S albumin including its
signal peptide or a signal peptide of another 2S
albumin, said nucleic acid being hereafter referred
to as the "precursor encoding nucleic acid";

• wherein said nucleic acid contains a nucleotide
sequence (hereafter termed the "relevant sequence"),
which relevant sequence comprises a nonessential
region of said 2S albumin modified by a heterologous
nucleic acid insert or substitution for part of said
nonessential region, said insert or substitution
forming an open reading frame in reading phase with the
non modified parts surrounding said insert in said
relevant sequence;

• wherein said insert or substitution encodes a
polypeptide formed of aminoacids, identical or
different from one another, with a proportion of at
least one appropriate aminoacid selected from
lysine, methionine, tryptophan, threonine,
phenylalanine, leucine, valine, isoleucine and
arginine sufficient that said modified 2S albumin
storage protein is enriched in at least one of said
appropriate aminoacids with respect to the content
of the same appropriate aminoacid in the natural
storage protein.

2. The process of claim 1 wherein said modified 2S 
albumin storage protein is derived from a natural 2S stor- 
age protein which is itself foreign to the transgenic 
plant. 

3. The process of claim 2 wherein said transgenic 
plant is a plant which has been transformed with a recombi- 
nant vector, e.g., a Ti-plasmid derived vector which con- 
tained said transformed storage protein. 

4. The process of any of claims 1 to 3 wherein the
modified storage protein is derived from a group of storage
proteins obtained from the following plants:

Ricinus communis
Arabidopsis thaliana
Brassica napus
Bertholletia excelsa

5. The process of any of claims 1 to 4 wherein said
modified storage protein contains a number of at least one
of said appropriate aminoacids, e.g., lysine, methionine or
tryptophan, greater by at least 2, preferably by 4, than the
number of the same appropriate aminoacid in the non
modified storage protein.

6. The process of any of claims 1 to 5 wherein said
insert is located in the region which extends in the non
modified 2S albumin between the codons which code for the
sixth and the seventh cysteine residues respectively,
whereby the groups formed of 3 codons, preferably 6 codons,
next to each of the codons which code for said sixth and
seventh cysteine residues and within said region code for
the same aminoacids as the corresponding groups of 3,
preferably 6, codons in the non modified storage protein.

7. The process of any of claims 1 to 6, wherein
said promoter is the natural promoter of said nucleic
acid.

8. The process of any of claims 1 to 6, wherein 
said promoter is heterologous with respect to said nucleic 
acid. 

9. The process of any of claims 1 to 6 wherein said
promoter is a constitutive promoter.

10. The process of any of claims 1 to 6 wherein
said promoter is a tissue specific promoter.

11. The process of claim 10 wherein said promoter is a
seed specific promoter.

12. The process of any of claims 1 to 11, wherein 
the heterologous insert is foreign to the natural nucleic 
acid encoding the precursor of said 2S albumin. 

13. The process of any one of claims 1 to 12,
wherein the heterologous insert contains a segment as
above-defined normally present in the genetic patrimony or
information of said seeds or plant cells, the
"heterologous" character of said insert then referring to
the one or several codons which surround said segment on
both sides and which link it to the non modified parts of
the nucleic acid encoding said precursor.

14. A recombinant DNA which includes a nucleic acid
sequence which can be transcribed into the mRNA encoding
at least part of the precursor of a 2S albumin including
the signal peptide of said plant, said nucleic acid being
hereafter referred to as the "precursor encoding nucleic
acid":



• wherein said nucleic acid contains a nucleotide
sequence (hereafter termed the "relevant sequence"),
which relevant sequence comprises a nonessential
region modified by a heterologous nucleic acid insert
forming an open reading frame in reading phase with
the non modified parts surrounding said insert in
said relevant sequence;

• wherein said insert includes a nucleotide
segment encoding a polypeptide containing said
appropriate aminoacids as defined in any of
claims 1 to 12;

• wherein said precursor encoding nucleic acid is
placed under the control of a promoter capable
of directing gene expression in plants.

15. The recombinant DNA of claim 14 which is a
plasmid.

16. The recombinant DNA of claim 15 which is
capable of transforming plant cells and of causing the
replication of said modified precursor nucleic acid
sequence in said plant cells.

17. The recombinant DNA of claim 16 which is a 
Ti-derived plasmid. 

18. A regenerable source of appropriate
aminoacids with high nutritional value, which is formed of
either plant cells of a seed-forming plant, which plant
cells are capable of being regenerated into the full plant,
or seeds of said seed-forming plants, wherein said plants
or seeds have been obtained as a result of one or several
generations of the plants resulting from the regeneration
of said plant cells, wherein further the DNA supporting
the genetic information of said plant cells or seeds
comprises a nucleic acid or part thereof, including the
sequences encoding the signal peptide, which can be
transcribed into the mRNA corresponding to the precursor of
a 2S albumin in said plant, placed under the control of a
promoter capable of directing gene expression in plants,

• wherein said nucleic acid sequence contains a
relevant modified sequence encoding the mature
2S albumin or one of the several sub-sequences
encoding the corresponding one or several
subunits of said mature storage protein,

• wherein further the modification of said
relevant sequence takes place in one of its
nonessential regions and consists of a heterologous
nucleic acid insert forming an open reading
frame in reading phase with the non modified parts
which surround said insert in the relevant
sequence,

• wherein said insert or substitution is as 
defined in any of claims 1 to 12. 

19. The source of polypeptide of claim 18, wherein
said insert is a synthetic man-made oligonucleotide.

20. The source of polypeptide of claim 18, wherein 
said insert is obtained from a prokaryotic or eukaryotic 
organism. 

21. The source of polypeptide of claim 18, 19 or
20, wherein the heterologous segment contained in said
insert encodes a non plant variety specific polypeptide.
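
The sequence conditions recited in claims 1, 5 and 6 can be
illustrated with a minimal Python sketch. It is an illustration
under stated assumptions, not part of the claimed process: the
toy protein sequences, the example insert and all function names
are hypothetical, the standard genetic code stop codons are
assumed, and the cysteine search assumes the insert itself
introduces no additional cysteine.

```python
from typing import Tuple

# Stop codons of the standard genetic code (assumption: nuclear plant genes).
STOP_CODONS = {"TAA", "TAG", "TGA"}


def insert_keeps_reading_frame(insert_dna: str) -> bool:
    """Claim 1 sketch: the insert is a whole number of codons with no in-frame stop."""
    seq = insert_dna.upper()
    if not seq or len(seq) % 3 != 0:
        return False
    codons = (seq[i:i + 3] for i in range(0, len(seq), 3))
    return all(codon not in STOP_CODONS for codon in codons)


def residue_gain(natural: str, modified: str, residue: str) -> int:
    """Claim 5 sketch: how many more copies of one residue the modified protein has."""
    r = residue.upper()
    return modified.upper().count(r) - natural.upper().count(r)


def region_between_cysteines(protein: str, first: int = 6, second: int = 7) -> Tuple[int, int]:
    """Return slice bounds of the residues strictly between two cysteines (1-based ranks)."""
    positions = [i for i, aa in enumerate(protein.upper()) if aa == "C"]
    if len(positions) < second:
        raise ValueError("protein has fewer cysteines than requested")
    return positions[first - 1] + 1, positions[second - 1]


def flanks_preserved(natural: str, modified: str, flank: int = 3) -> bool:
    """Claim 6 sketch: the `flank` residues next to each cysteine are unchanged."""
    n_start, n_end = region_between_cysteines(natural)
    m_start, m_end = region_between_cysteines(modified)
    n_region = natural[n_start:n_end].upper()
    m_region = modified[m_start:m_end].upper()
    if len(n_region) < 2 * flank or len(m_region) < 2 * flank:
        return False
    return (n_region[:flank] == m_region[:flank]
            and n_region[-flank:] == m_region[-flank:])


# Hypothetical toy sequences (one-letter amino acid codes), not real 2S albumins.
NATURAL = "MCACECGCHCLCQPRAAGSLCVCA"
MODIFIED = "MCACECGCHCLCQPRMKMMGSLCVCA"

if __name__ == "__main__":
    print(insert_keeps_reading_frame("ATGAAGATGATG"))
    print(residue_gain(NATURAL, MODIFIED, "M"))
    print(flanks_preserved(NATURAL, MODIFIED))
```

In this toy case the region between the sixth and seventh
cysteines is altered only in its middle, so the three residues
adjoining each cysteine are retained while the methionine count
rises by more than the threshold of 2 named in claim 5.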

Fig. 6A: flowchart of constructions showing successive
steps in the deletion of sequences encoding most of the
hypervariable region of the Arabidopsis 2S and their
replacement with sequences rich in methionine codons

Fig. 6A (cont.)